fix(workflow-compiling-service): LogicalLink OperatorIdentity round-trip#5729
fix(workflow-compiling-service): LogicalLink OperatorIdentity round-trip#5729wwong0 wants to merge 5 commits into
Conversation
…ound-trip
Prior to this change, `@JsonCreator` was placed on the `String`
convenience constructor in both `amber` and `workflow-compiling-service`
`LogicalLink`. Jackson serialises `OperatorIdentity` as an object
(`{"id":"op-A"}`), so any round-trip through `writeValueAsString` /
`readValue` failed with `MismatchedInputException` because the String
constructor cannot accept an object node.
Fix: add a private `readOperatorIdentity(JsonNode, String)` helper to
the companion object and move `@JsonCreator` to a new `JsonNode`
constructor that delegates to it. The helper handles the string shape
(front-end input), the object shape (serialised form), null/absent
fields, and rejects non-text / non-object nodes with
`IllegalArgumentException`.
The new `workflow-compiling-service` `LogicalLinkSpec` adds direct unit
coverage of the `readOperatorIdentity` branches that the existing
HTTP-layer-only `WorkflowCompilationResourceSpec` did not reach. It also
pins the leniency contract (no `require` guards) and the Jackson
annotation contract (`@JsonProperty` key pinning, round-trip fidelity).
The `amber` `LogicalLinkSpec` is updated to match the renamed constructor
section, drop the now-invalid MismatchedInputException expectation, and add
the round-trip test.
Closes apache#5042
|
👋 Thanks for your first contribution to Texera, @wwong0! If you're looking for a good place to start, browse issues labeled You can drive common housekeeping yourself by commenting one of these commands on its own line:
Each command must match exactly: |
There was a problem hiding this comment.
Pull request overview
This PR fixes Jackson (de)serialization for LogicalLink so OperatorIdentity fields can round-trip through objectMapper.writeValueAsString → readValue in both the Amber engine and workflow-compiling-service, while preserving each module’s strict vs. lenient validation behavior.
Changes:
- Move
@JsonCreatoroff the String convenience constructor and introduce aJsonNode-based creator that can read both string-shaped and object-shaped operator ids. - Add a
readOperatorIdentity(node, fieldName)helper to parsefromOpId/toOpIdfrom either JSON shape and centralize error handling. - Add/adjust unit tests to cover the round-trip behavior and module-specific strictness/leniency contracts.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| workflow-compiling-service/src/main/scala/org/apache/texera/amber/compiler/model/LogicalLink.scala | Adds JsonNode @JsonCreator + helper to support both string/object OperatorIdentity JSON shapes. |
| workflow-compiling-service/src/test/scala/org/apache/texera/amber/compiler/model/LogicalLinkSpec.scala | New spec covering leniency contract and Jackson round-trip behavior for compiler-service LogicalLink. |
| amber/src/main/scala/org/apache/texera/workflow/LogicalLink.scala | Mirrors the JsonNode @JsonCreator + helper in the Amber engine LogicalLink (while keeping require guards). |
| amber/src/test/scala/org/apache/texera/workflow/LogicalLinkSpec.scala | Updates characterization tests to assert successful round-trip after the fix. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #5729 +/- ##
============================================
- Coverage 59.14% 58.51% -0.63%
- Complexity 3201 3212 +11
============================================
Files 1132 1145 +13
Lines 43681 44435 +754
Branches 4734 4817 +83
============================================
+ Hits 25834 26001 +167
- Misses 16416 16999 +583
- Partials 1431 1435 +4
*This pull request uses carry forward flags. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
| config | throughput | MB/s | latency | max Δ latest / 7d | |
|---|---|---|---|---|---|
| 🔴 | bs=10 sw=10 sl=64 | 380 | 0.232 | 26,753/32,258/32,258 us | 🔴 +9.7% / 🔴 +113.0% |
| ⚪ | bs=100 sw=10 sl=64 | 807 | 0.492 | 123,218/149,147/149,147 us | ⚪ within ±5% / 🔴 +38.2% |
| ⚪ | bs=1000 sw=10 sl=64 | 929 | 0.567 | 1,080,028/1,113,562/1,113,562 us | ⚪ within ±5% / 🔴 +9.1% |
Baseline details
Latest main 6de8a48 from same runner
| config | metric | PR | latest main | 7d avg | Δ latest | Δ 7d |
|---|---|---|---|---|---|---|
| bs=10 sw=10 sl=64 | throughput | 380 tuples/sec | 407 tuples/sec | 776.36 tuples/sec | -6.6% | -51.1% |
| bs=10 sw=10 sl=64 | MB/s | 0.232 MB/s | 0.248 MB/s | 0.474 MB/s | -6.5% | -51.0% |
| bs=10 sw=10 sl=64 | p50 | 26,753 us | 24,382 us | 12,636 us | +9.7% | +111.7% |
| bs=10 sw=10 sl=64 | p95 | 32,258 us | 32,891 us | 15,143 us | -1.9% | +113.0% |
| bs=10 sw=10 sl=64 | p99 | 32,258 us | 32,891 us | 18,954 us | -1.9% | +70.2% |
| bs=100 sw=10 sl=64 | throughput | 807 tuples/sec | 833 tuples/sec | 985.33 tuples/sec | -3.1% | -18.1% |
| bs=100 sw=10 sl=64 | MB/s | 0.492 MB/s | 0.509 MB/s | 0.601 MB/s | -3.3% | -18.2% |
| bs=100 sw=10 sl=64 | p50 | 123,218 us | 118,590 us | 101,671 us | +3.9% | +21.2% |
| bs=100 sw=10 sl=64 | p95 | 149,147 us | 143,337 us | 107,939 us | +4.1% | +38.2% |
| bs=100 sw=10 sl=64 | p99 | 149,147 us | 143,337 us | 113,798 us | +4.1% | +31.1% |
| bs=1000 sw=10 sl=64 | throughput | 929 tuples/sec | 943 tuples/sec | 1,016 tuples/sec | -1.5% | -8.6% |
| bs=1000 sw=10 sl=64 | MB/s | 0.567 MB/s | 0.576 MB/s | 0.62 MB/s | -1.6% | -8.6% |
| bs=1000 sw=10 sl=64 | p50 | 1,080,028 us | 1,053,138 us | 989,693 us | +2.6% | +9.1% |
| bs=1000 sw=10 sl=64 | p95 | 1,113,562 us | 1,150,346 us | 1,028,327 us | -3.2% | +8.3% |
| bs=1000 sw=10 sl=64 | p99 | 1,113,562 us | 1,150,346 us | 1,059,969 us | -3.2% | +5.1% |
Raw CSV
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,526.01,200,128000,380,0.232,26752.92,32257.87,32257.87
1,100,10,64,20,2479.22,2000,1280000,807,0.492,123217.81,149147.35,149147.35
2,1000,10,64,20,21527.33,20000,12800000,929,0.567,1080028.40,1113562.02,1113562.02…d in LogicalLink
Address review feedback on the LogicalLink `@JsonCreator` fix: the object-shape branch of `readOperatorIdentity` previously coerced a non-textual `id` (e.g. `{"id": 123}`) via `asText()`, silently turning it into `"123"`—inconsistent with the helper's contract of rejecting malformed nodes, and with the existing rejection of a non-text/non-object top-level node. Require `id` to be textual (or null/absent); otherwise throw `IllegalArgumentException`. Applied to both the amber engine and the workflow-compiling-service `LogicalLink` (identical helpers).
Extend both `LogicalLinkSpec` suites with branch-coverage tests for `readOperatorIdentity`: object-shape `id` missing, explicit null, numeric/non-textual `id`, and a top-level explicit-null op id. These pin the new rejection behavior.
…k-operatoridentity-round-trip
|
Hi @wwong0, thanks for the PR. overall it looks good. can you please make coverage higher, before I review details? you can find coverage report here #5729 (comment). thanks. |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> Signed-off-by: William Wong <120128836+wwong0@users.noreply.github.com>
|
Hi @Yicong-Huang the CodeCov report updated after your comment, it now shows higher coverage. |
Automated Reviewer SuggestionsBased on the
|
What changes were proposed in this PR?
Bug fix:
LogicalLink@JsonCreatorconstructor (amberandworkflow-compiling-service)@JsonCreatorwas previously placed on theStringconvenience constructor ofLogicalLinkin both modules.OperatorIdentityis a case class that Jackson serializes as an object ({"id":"op-A"}), not a plain string. When reading back a serializedLogicalLink, Jackson dispatched to the@JsonCreatorString constructor but could not coerce the{"id":"op-A"}object node to aString, causing anywriteValueAsString→readValueround-trip to fail withMismatchedInputException.The fix introduces a private
readOperatorIdentity(node: JsonNode, fieldName: String)helper in the companion object and moves@JsonCreatorto a newJsonNodeconstructor that delegates to it. The helper accepts both the plain-string shape (front-end input) and the object shape (serialized form), maps null/absent ids toOperatorIdentity(null), and rejects malformed nodes. TheStringconvenience constructor is retained but no longer carries@JsonCreator.Any related issues, documentation, discussions?
Closes #5042
This PR continues and adopts work from #5175
How was this PR tested?
The new
workflow-compiling-serviceLogicalLinkSpecadds unit coverage ofLogicalLinkandreadOperatorIdentitymodel-level logic that existing tests do not cover and pins the leniency contract (norequireguards in the compiler-service variant).The existing
amberLogicalLinkSpecwas updated to match the renamed constructor section, drop the now-invalidMismatchedInputExceptionexpectation, and add a passing round-trip test.Run with:
Result: 18/18 tests pass
Result: 15/15 tests pass
Was this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Sonnet 4.6